Show and Tell: Prompt Strategies for Style Control in Multi-Turn LLM Code Generation
Language models generate functionally correct code that tends toward excessive verbosity, with elaborate documentation and defensive patterns that diverge from human baselines. Two prompting mechanisms have emerged for stylistic control: instruction-based prompts that articulate abstract directives, and example-based prompts that provide concrete code demonstrations. The core problem is whether stylistic constraints persist when models enhance initial implementations with additional features while maintaining high functional accuracy. Here we show that instruction-based, example-based, and combined prompts produce distinct patterns of initial control and expansion discipline over one enhancement turn. We manipulated system prompts across four conditions in a paired two-turn protocol in which models first generated solutions to an intermediate Python task, then revised their code under general improvement directives, holding the user task fixed (N = 160 paired programs). Combined prompts produced the strongest initial compression and the greatest expansion discipline. Instructions showed large initial effects and moderate expansion discipline. Examples showed modest initial effects with no expansion discipline. These results show that initial prompt effectiveness and expansion discipline are separate aspects of prompt design, and that combined approaches provide the most stable stylistic control in this two-turn workflow.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
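The paired two-turn protocol in the Show and Tell abstract can be sketched minimally. This is a hedged illustration, not the paper's harness: `generate` stands in for any LLM call, non-blank line count is a crude verbosity proxy, and the prompts and stub model are assumptions.

```python
def count_lines(code: str) -> int:
    """Verbosity proxy: number of non-blank lines."""
    return sum(1 for line in code.splitlines() if line.strip())

def run_paired_trial(generate, system_prompt: str, task: str):
    """Turn 1: initial solution. Turn 2: revision under a generic
    improvement directive, holding the user task fixed."""
    initial = generate(system_prompt, task)
    revised = generate(system_prompt,
                       f"{task}\n\nImprove your previous solution:\n{initial}")
    # Comparing the two lengths measures "expansion discipline":
    # a ratio near 1 means the revision did not balloon in size.
    return count_lines(initial), count_lines(revised)

# Stub "model" for illustration: adds one comment line when revising.
def stub_model(system_prompt, user_prompt):
    base = "def solve(x):\n    return x * 2"
    return base + "\n# revised" if "Improve" in user_prompt else base

init_len, rev_len = run_paired_trial(stub_model, "Be concise.", "Double a number.")
```

Repeating this trial across the four system-prompt conditions and many tasks yields the paired programs the abstract analyzes.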
The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More
Kitouni, Ouail; Nolte, Niklas; Bouchacourt, Diane; Williams, Adina; Rabbat, Mike; Ibrahim, Mark
Today's best language models still struggle with hallucinations: factually incorrect generations that impede their ability to reliably retrieve information seen during training. The reversal curse, where models cannot recall information when probed in a different order than was encountered during training, exemplifies this failure in information retrieval. We reframe the reversal curse as a factorization curse -- a failure of models to learn the same joint distribution under different factorizations. Through a series of controlled experiments with increasing levels of realism, including WikiReversal, a setting we introduce to closely simulate a knowledge-intensive finetuning task, we find that the factorization curse is an inherent failure of the next-token prediction objective used in popular large language models. Moreover, we demonstrate that reliable information retrieval cannot be solved with scale, reversed tokens, or even naive bidirectional-attention training. Consequently, various approaches to finetuning on specialized data would necessarily provide mixed results on downstream tasks unless the model has already seen the right sequence of tokens. Across five tasks of varying levels of complexity, our results uncover a promising path forward: factorization-agnostic objectives can significantly mitigate the reversal curse and hint at improved knowledge storage and planning capabilities.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Hungary > Veszprém County > Veszprém (0.04)
- Asia > China > Hong Kong (0.04)
- (15 more...)
- Research Report > New Finding (0.66)
- Research Report > Strength High (0.54)
- Research Report > Experimental Study (0.54)
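The factorization framing in the abstract can be stated concretely. The chain rule writes one joint distribution in many orders (standard notation, ours rather than the paper's):

```latex
p(x_1, x_2) \;=\; p(x_1)\,p(x_2 \mid x_1) \;=\; p(x_2)\,p(x_1 \mid x_2)
```

Next-token training maximizes only the left-to-right terms $\sum_t \log p_\theta(x_t \mid x_{<t})$, so a model can fit $p(x_2 \mid x_1)$ well while never being trained on, and failing to recover, $p(x_1 \mid x_2)$ -- the reversal curse in distributional terms.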
AVOIDDS: Aircraft Vision-based Intruder Detection Dataset and Simulator
Smyers, Elysia Q.; Katz, Sydney M.; Corso, Anthony L.; Kochenderfer, Mykel J.
Designing robust machine learning systems remains an open problem, and there is a need for benchmark problems that cover both environmental changes and evaluation on a downstream task. In this work, we introduce AVOIDDS, a realistic object detection benchmark for the vision-based aircraft detect-and-avoid problem. We provide a labeled dataset consisting of 72,000 photorealistic images of intruder aircraft with various lighting conditions, weather conditions, relative geometries, and geographic locations. We also provide an interface that evaluates trained models on slices of this dataset to identify changes in performance with respect to changing environmental conditions. Finally, we implement a fully-integrated, closed-loop simulator of the vision-based detect-and-avoid problem to evaluate trained models with respect to the downstream collision avoidance task. This benchmark will enable further research in the design of robust machine learning systems for use in safety-critical applications. The AVOIDDS dataset and code are publicly available at https://purl.stanford.edu/hj293cv5980 and https://github.com/sisl/VisionBasedAircraftDAA respectively.
- North America > United States > California > Santa Clara County > Palo Alto (0.35)
- North America > United States > Wisconsin > Winnebago County > Oshkosh (0.04)
- North America > United States > Nevada > Washoe County > Reno (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Transportation > Air (1.00)
- Aerospace & Defense > Aircraft (1.00)
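The slice-based evaluation interface the AVOIDDS abstract describes can be sketched generically. This is an illustrative sketch, not the AVOIDDS API: the record fields (`lighting`, `label`, `detected`) and the accuracy metric are assumptions.

```python
from collections import defaultdict

def evaluate_by_slice(records, slice_key):
    """Group labeled detection results by an environmental condition and
    report per-slice accuracy, exposing performance shifts across conditions."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        key = rec[slice_key]
        totals[key] += 1
        hits[key] += int(rec["detected"] == rec["label"])
    return {k: hits[k] / totals[k] for k in totals}

# Toy records: detection outcomes under two lighting conditions.
records = [
    {"lighting": "day",   "label": True,  "detected": True},
    {"lighting": "day",   "label": True,  "detected": True},
    {"lighting": "night", "label": True,  "detected": False},
    {"lighting": "night", "label": True,  "detected": True},
]
per_slice = evaluate_by_slice(records, "lighting")
```

A gap between slices (here, day vs. night accuracy) is exactly the kind of environmental sensitivity the benchmark is designed to surface.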
Inferring Inference
Raju, Rajkumar Vasudeva; Li, Zhe; Linderman, Scott; Pitkow, Xaq
Patterns of microcircuitry suggest that the brain has an array of repeated canonical computational units. Yet neural representations are distributed, so the relevant computations may only be related indirectly to single-neuron transformations. It thus remains an open challenge how to define canonical distributed computations. We integrate normative and algorithmic theories of neural computation into a mathematical framework for inferring canonical distributed computations from large-scale neural activity patterns. At the normative level, we hypothesize that the brain creates a structured internal model of its environment, positing latent causes that explain its sensory inputs, and uses those sensory inputs to infer the latent causes. At the algorithmic level, we propose that this inference process is a nonlinear message-passing algorithm on a graph-structured model of the world. Given a time series of neural activity during a perceptual inference task, our framework finds (i) the neural representation of relevant latent variables, (ii) interactions between these variables that define the brain's internal model of the world, and (iii) message-functions specifying the inference algorithm. These targeted computational properties are then statistically distinguishable due to the symmetries inherent in any canonical computation, up to a global transformation. As a demonstration, we simulate recordings for a model brain that implicitly implements an approximate inference algorithm on a probabilistic graphical model. Given its external inputs and noisy neural activity, we recover the latent variables, their neural representation and dynamics, and canonical message-functions. We highlight features of experimental design needed to successfully extract canonical computations from neural data. Overall, this framework provides a new tool for discovering interpretable structure in neural recordings.
- North America > United States > Wisconsin > Winnebago County > Menasha (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Government (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- (3 more...)
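The message-passing inference the Inferring Inference abstract builds on can be illustrated in its simplest form: a sum-product message on a two-variable chain. This is a generic textbook sketch with made-up numbers, not the paper's learned nonlinear message functions.

```python
# Joint over two binary latents: p(a, b) ∝ prior[a] * coupling[a][b] * evidence[b]
prior = [0.5, 0.5]                 # p(a)
coupling = [[0.9, 0.1],            # p(b | a), rows indexed by a
            [0.2, 0.8]]
evidence = [0.7, 0.3]              # likelihood of the observation given b

# Sum-product message from a to b: marginalize a out along the chain edge.
msg_a_to_b = [sum(prior[a] * coupling[a][b] for a in range(2)) for b in range(2)]

# Posterior marginal over b: incoming message times local evidence, normalized.
unnorm = [m * e for m, e in zip(msg_a_to_b, evidence)]
z = sum(unnorm)
posterior_b = [u / z for u in unnorm]
```

The framework in the abstract works in the other direction: given neural activity, it recovers which latent variables are represented, how they interact, and what message functions implement updates like this one.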
Doroni Aerospace Announces New Crowdfunding Campaign on StartEngine
Doroni Aerospace, Inc. ("Doroni") announces the launch of its new crowdfunding campaign on the equity crowdfunding platform StartEngine. The company previously closed its first Reg CF raise on StartEngine on April 29, 2022, having officially raised $1,069,850 from 916 investors. Now the company has its sights set on a $2M offering max and is offering investors 50% Bonus Shares of Preferred Stock for the first 3 days the campaign is live as part of a limited-time welcome-back promotion. Doroni CEO/Founder Doron Merdinger is also inviting long-time supporters as well as new investors to join the team for an exclusive welcome-back webinar on Wednesday, July 20th, at 3 PM EST. Doron will provide an overview of the company and the current development progress of the H1 eVTOL, answer questions, and offer a glimpse at what's next for the company.
- Transportation > Air (1.00)
- Aerospace & Defense > Aircraft (0.72)
- Government > Regional Government > North America Government > United States Government (0.31)
People
- Problem decomposition and theory reformulation; integrated cognitive architectures for autonomous robots; distributed constraint satisfaction problems; semigroup theory and dynamical systems; category theory in software design.
- Interests include machine learning, approximation algorithms, on-line algorithms and planning systems.
- Calvin, William H. – Theoretical neurophysiologist and author of "The Cerebral Code" and "How Brains Think".
- Gesture and narrative language, animated agents, intonation, facial expression, computer vision.
- Intersection of computer science and game theory, computer science and economics, multiagent systems, automated negotiation and contracting.
- North America > United States > California (0.29)
- North America > Canada > Ontario > Toronto (0.15)
- North America > United States > North Carolina (0.05)
- (32 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)